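Augmented Business Process Management Systems (ABPMSs) are an emerging class of process-aware information systems that draw on trustworthy AI technology. An ABPMS enhances the execution of business processes with the aim of making these processes more adaptable, proactive, explainable, and context-sensitive. This manifesto presents a vision for ABPMSs and discusses the research challenges that must be overcome to realize it. To this end, we define the concept of an ABPMS, outline the lifecycle of processes within an ABPMS, discuss the core characteristics of an ABPMS, and derive a set of challenges for realizing systems with these characteristics.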
translated by 谷歌翻译
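State-of-the-art process discovery methods construct free-choice process models from event logs. Consequently, the constructed models do not take into account indirect dependencies between events. Whenever the input behavior is not free-choice, these methods fail to provide a precise model. In this paper, we propose a novel approach for enhancing free-choice process models by adding non-free-choice constructs discovered via region-based techniques. This allows us to benefit from the performance of existing process discovery methods as well as the accuracy of the underlying synthesis techniques. We prove that the proposed approach preserves fitness with respect to the event log while improving precision when indirect dependencies exist. The approach has been implemented and tested on both synthetic and real-life datasets. The results show its effectiveness in repairing models discovered from event logs.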
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of the challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on either multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
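The patch-based strategy mentioned above for samples too large to process at once can be sketched as follows. This is a minimal illustration rather than any participant's pipeline: the function names, the volume size, and the patch size are assumptions made for the example, and the sketch assumes the stride does not exceed the patch size so that the patches cover the whole volume.

```python
from itertools import product

def patch_starts(size, patch, stride):
    """Start offsets along one axis so that patches of length `patch` cover [0, size)."""
    if patch >= size:
        return [0]  # a single patch spans the whole axis
    starts = list(range(0, size - patch + 1, stride))
    if starts[-1] != size - patch:
        starts.append(size - patch)  # shift a final patch back so the tail is covered
    return starts

def patch_slices(shape, patch_shape, stride_shape):
    """Yield one tuple of slices per patch of an N-dimensional volume."""
    axes = [patch_starts(s, p, st)
            for s, p, st in zip(shape, patch_shape, stride_shape)]
    for corner in product(*axes):
        yield tuple(slice(c, min(c + p, s))
                    for c, p, s in zip(corner, patch_shape, shape))

# Hypothetical 3D scan of 300 x 300 x 100 voxels cut into 128 x 128 x 64 patches.
patches = list(patch_slices((300, 300, 100), (128, 128, 64), (128, 128, 64)))
print(len(patches))  # 3 * 3 * 2 = 18 patches
```

Each yielded tuple can index an array-like volume directly (for example `volume[slices]` with NumPy), so only one patch needs to be held in memory at a time.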
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.
This paper focuses on the uncertainty estimation of white matter lesions (WML) segmentation in magnetic resonance imaging (MRI). On one side, voxel-scale segmentation errors cause the erroneous delineation of the lesions; on the other side, lesion-scale detection errors lead to wrong lesion counts. Both of these factors are clinically relevant for the assessment of multiple sclerosis patients. This work aims to compare the ability of different voxel-scale and lesion-scale uncertainty measures to capture errors related to segmentation and lesion detection, respectively. Our main contributions are (i) proposing new measures of lesion-scale uncertainty that do not utilise voxel-scale uncertainties; (ii) extending an error retention curve analysis framework to the evaluation of lesion-scale uncertainty measures. Our results, obtained on a multi-center testing set of 58 patients, demonstrate that the proposed lesion-scale measures achieve the best performance among the analysed measures. All code implementations are provided at https://github.com/NataliiaMolch/MS_WML_uncs
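The idea behind an error retention curve can be illustrated with a small voxel-scale sketch. It assumes one common formulation, in which the most uncertain predictions are progressively replaced by the ground truth and a quality metric (here the Dice coefficient) is recomputed at each step. It is not the authors' implementation, it does not reproduce the lesion-scale extension described above, and all names and toy values are hypothetical.

```python
def dice(pred, truth):
    """Dice similarity coefficient of two equal-length binary sequences."""
    overlap = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    return 1.0 if total == 0 else 2.0 * overlap / total

def retention_curve(pred, truth, uncertainty, steps=10):
    """Dice after replacing the most uncertain predictions with the ground truth.

    Returns (retained_fraction, dice) pairs, from all predictions retained
    down to none retained.
    """
    # Indices ordered from most to least uncertain.
    order = sorted(range(len(pred)), key=lambda i: uncertainty[i], reverse=True)
    curve = []
    for step in range(steps + 1):
        n_replaced = round(len(pred) * step / steps)
        corrected = list(pred)
        for i in order[:n_replaced]:
            corrected[i] = truth[i]
        curve.append((1.0 - step / steps, dice(corrected, truth)))
    return curve

def area_under_curve(curve):
    """Trapezoidal area under the retention curve."""
    pts = sorted(curve)
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))

# Toy example: voxels 1 and 2 are wrong.
truth = [1, 1, 0, 0]
pred = [1, 0, 1, 0]
informative = [0.1, 0.9, 0.8, 0.2]    # high uncertainty on the wrong voxels
uninformative = [0.9, 0.1, 0.2, 0.8]  # high uncertainty on the correct voxels

auc_good = area_under_curve(retention_curve(pred, truth, informative, steps=4))
auc_bad = area_under_curve(retention_curve(pred, truth, uninformative, steps=4))
print(auc_good > auc_bad)
```

Under this formulation, an uncertainty measure that ranks the erroneous voxels first drives the curve up faster, so a larger area under the curve indicates a measure that captures errors better.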
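In recent years, the application of deep learning algorithms to Earth observation (EO) has enabled substantial progress in fields that rely on remotely sensed data. However, given the scale of the data in EO, creating large datasets with expert pixel-level annotations is expensive and time-consuming. In this context, priors are seen as an attractive way to alleviate the burden of manual labeling when training deep learning methods for EO. For some applications, these priors are readily available. Motivated by the great success of contrastive learning methods for self-supervised feature representation learning in many computer vision tasks, this study proposes an online deep clustering method that uses crop label proportions as priors for sample-level learning, based on government crop-proportion data for an entire agricultural region. We evaluate the method using two large datasets from two different agricultural regions in Brazil. Extensive experiments demonstrate that the method is robust to different data types (synthetic aperture radar and optical images) and reports higher accuracy values for the major crop types in the target regions. It can therefore alleviate the burden of large-scale image annotation in EO applications.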
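The pre-training of large language models usually requires massive amounts of resources, both in terms of computation and data. Frequently used web sources such as Common Crawl might contain enough noise to make this pre-training sub-optimal. In this work, we experiment with different sampling methods on the Spanish version of mC4 and present a novel data-centric technique, which we name $\textit{perplexity sampling}$, that enables the pre-training of language models in roughly half the number of steps and using one fifth of the data. The resulting models are comparable to the current state of the art and even achieve better results for certain tasks. Our work demonstrates the versatility of Transformers and paves the way for small teams to train their models on a limited budget. Our models are available at this $\href{https://huggingface.co/bertin-project}{URL}$.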
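The abstract does not spell out how perplexity sampling weights documents, so the following is only a sketch of one plausible scheme, assumed for illustration: each document is kept with a probability that peaks at the median perplexity and falls off for very low or very high values. The function names, the Gaussian shape, the `width` parameter, and the toy perplexity values are all hypothetical.

```python
import math
import random
import statistics

def gaussian_weights(perplexities, width=1.0):
    """Weight each document by a Gaussian centred on the median perplexity."""
    center = statistics.median(perplexities)
    q1, _, q3 = statistics.quantiles(perplexities, n=4)
    # Use the interquartile range as the spread (assumption of this sketch).
    scale = max(q3 - q1, 1e-12) * width
    return [math.exp(-0.5 * ((p - center) / scale) ** 2) for p in perplexities]

def perplexity_subsample(docs, perplexities, keep_rate=0.2, seed=0):
    """Keep roughly `keep_rate` of the documents, favouring mid-range perplexity."""
    rng = random.Random(seed)
    weights = gaussian_weights(perplexities)
    mean_weight = sum(weights) / len(weights)
    kept = []
    for doc, weight in zip(docs, weights):
        # Scale so that the average keep probability is about `keep_rate`.
        if rng.random() < min(1.0, keep_rate * weight / mean_weight):
            kept.append(doc)
    return kept

# Hypothetical per-document perplexities from some reference language model.
docs = ["doc_a", "doc_b", "doc_c", "doc_d", "doc_e"]
perplexities = [10.0, 50.0, 100.0, 150.0, 1000.0]
weights = gaussian_weights(perplexities)
subset = perplexity_subsample(docs, perplexities, keep_rate=0.2)
```

The default `keep_rate=0.2` mirrors the "one fifth of the data" figure quoted above; the perplexities themselves would have to come from a separately trained language model, which this sketch does not cover.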
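Distributional shift, or the mismatch between training and deployment data, is a significant obstacle to the use of machine learning in high-stakes industrial applications such as autonomous driving and medicine. This creates a need to assess how robustly ML models generalize, as well as the quality of their uncertainty estimates. Standard ML baseline datasets do not allow these properties to be assessed, because the training, validation, and test data are often identically distributed. Recently, a range of dedicated benchmarks has appeared that feature both distributionally matched and shifted data. Among these benchmarks, the Shifts dataset stands out in terms of the diversity of its tasks and the data modalities it features. While most benchmarks are dominated by 2D image classification tasks, Shifts contains tabular weather forecasting, machine translation, and vehicle motion prediction tasks. This makes it possible to assess the robustness properties of models on a diverse set of industrial-scale tasks and to reach either universal or directly applicable task-specific conclusions. In this paper, we extend the Shifts dataset with two datasets sourced from industrial, high-risk applications of high societal importance. Specifically, we consider the segmentation of white matter multiple sclerosis lesions in 3D magnetic resonance brain images and the estimation of power consumption in marine cargo vessels. Both tasks feature ubiquitous distributional shifts and strict safety requirements due to the high cost of errors. These new datasets will allow researchers to further explore robust generalization and uncertainty estimation in new situations. In this work, we provide a description of the datasets and baseline results for both tasks.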
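Magnetic resonance imaging (MRI) is a central modality for stroke imaging. It is used upon patient admission to make treatment decisions, such as selecting patients for intravenous thrombolysis or endovascular therapy. MRI is subsequently used during the hospital stay to predict outcome by visualizing infarct core size and location. Furthermore, it may be used to characterize stroke etiology, for example to differentiate between (cardio-)embolic and non-embolic stroke. Computer-based automated medical image processing is increasingly finding its way into clinical routine. Previous iterations of the Ischemic Stroke Lesion Segmentation (ISLES) challenge have helped to identify benchmark methods for acute and sub-acute ischemic stroke lesion segmentation. Here, we introduce an expert-annotated, multicenter MRI dataset for the segmentation of acute to subacute stroke lesions. The dataset comprises 400 multi-vendor MRI cases with high variability in stroke lesion size, quantity, and location. It is split into a training dataset of n = 250 and a test dataset of n = 150. All training data will be made publicly available. The test dataset will be used for model validation only and will not be released to the public. This dataset serves as the foundation of the ISLES 2022 challenge, whose goal is to enable the development and benchmarking of robust and accurate segmentation algorithms for ischemic stroke.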
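Human ratings are abstract representations of segmentation quality. To approximate human quality ratings from scarce expert data, we train surrogate quality estimation models. We evaluate on a complex multi-class segmentation problem, specifically glioma segmentation following the BraTS annotation protocol. The training data features quality ratings from 15 expert neuroradiologists, on a scale from 1 to 6 stars, for various computer-generated and manual 3D annotations. Even though the networks operate on 2D images and are trained on scarce data, we can approximate segmentation quality within a margin of error comparable to human intra-rater reliability. Segmentation quality prediction has broad applications. While an understanding of segmentation quality is imperative for the successful clinical translation of automatic segmentation algorithms, it can also play an essential role in training new segmentation models. Owing to its split-second inference times, it can be applied directly within a loss function or as a fully automatic dataset curation mechanism in a federated learning setting.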